Smart Paper Notes

Concordance based Survival Cobra with regression type weak learners

Rahul Goswami , Arabin Kumar Dey

Categories: Machine Learning (Statistics) | Artificial Intelligence | Machine Learning

2022-09-24

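In this paper, we predict conditional survival functions through a combined regression strategy. We take the weak learners to be different random survival trees. We propose maximizing concordance in the right-censored setting to find the optimal parameters. We explore two approaches: the usual Survival COBRA and a novel weighted predictor based on the concordance index. Our proposed formulations use two different norms, namely the max-norm and the Frobenius norm, to find a proximity set of predictions around a query point from the test dataset. We illustrate our algorithms on three different real-life datasets.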
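To make the concordance criterion mentioned in the abstract concrete, here is a minimal sketch of Harrell's concordance index for right-censored data. It is not the authors' implementation: the function name `concordance_index`, the argument layout, and the choice to skip pairs with tied observed times are all assumptions made for illustration.

```python
from itertools import combinations


def concordance_index(times, events, risk_scores):
    """Simplified Harrell's C for right-censored data (illustrative sketch).

    times:       observed time for each subject
    events:      1 if the event was observed, 0 if right-censored
    risk_scores: higher score means shorter predicted survival
    """
    concordant = 0.0
    comparable = 0
    for i, j in combinations(range(len(times)), 2):
        # Let `a` be the subject with the shorter observed time.
        a, b = (i, j) if times[i] < times[j] else (j, i)
        # Assumption: pairs with tied times are skipped. A pair is also
        # not comparable when the earlier subject was censored.
        if times[a] == times[b] or not events[a]:
            continue
        comparable += 1
        if risk_scores[a] > risk_scores[b]:
            concordant += 1.0  # earlier event got the higher risk score
        elif risk_scores[a] == risk_scores[b]:
            concordant += 0.5  # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable
```

A value of 1.0 means every comparable pair is ranked correctly, and 0.5 corresponds to random ranking. A tuning loop could score each candidate parameter setting with this function and keep the one with the highest value.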

translated by Google Translate

BLOOM: A 176B-Parameter Open-Access Multilingual Language Model

Teven Le Scao , Angela Fan , Christopher Akiki , Ellie Pavlick , Suzana Ilić , Daniel Hesslow , Roman Castagné , Alexandra Sasha Luccioni , François Yvon , Matthias Gallé

Categories: Natural Language Processing

2022-11-09

Large language models (LLMs) have been shown to be able to perform new tasks based on a few demonstrations or natural language instructions. While these capabilities have led to widespread adoption, most LLMs are developed by resource-rich organizations and are frequently kept from the public. As a step towards democratizing this powerful technology, we present BLOOM, a 176B-parameter open-access language model designed and built thanks to a collaboration of hundreds of researchers. BLOOM is a decoder-only Transformer language model that was trained on the ROOTS corpus, a dataset comprising hundreds of sources in 46 natural and 13 programming languages (59 in total). We find that BLOOM achieves competitive performance on a wide variety of benchmarks, with stronger results after undergoing multitask prompted finetuning. To facilitate future research and applications using LLMs, we publicly release our models and code under the Responsible AI License.

Robust Artificial Delay based Impedance Control of Robotic Manipulators with Uncertain Dynamics

Udayan Banerjee , Bhabani Shankar Dey , Indra Narayan Kar , Subir Kumar Saha

Categories: Robotics

2022-08-18

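In this paper, an artificial delay based impedance controller is proposed for robotic manipulators with uncertain dynamics. The control law unites a super twisting algorithm (STA) type second-order switching controller with the time delay estimation (TDE) framework via a novel generalized filtered tracking error (GFTE). The TDE framework removes the need for an accurate model of the robot dynamics by estimating the uncertain robot dynamics and interaction forces from the immediate past data of the state and control effort, while the second-order switching control law in the outer loop provides robustness against the TDE error that arises from approximating the manipulator dynamics. The proposed control law thus seeks to establish a desired impedance model between the robot end-effector variables, i.e. force and motion, in the presence of uncertainties, both when encountering smooth contact forces and during free motion. Simulation results for a two-link manipulator using the proposed controller, along with a convergence analysis, are presented to validate the proposition.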

Explainable and High-Performance Hate and Offensive Speech Detection

Marzieh Babaeianjelodar , Gurram Poorna Prudhvi , Stephen Lorenz , Keyu Chen , Sumona Mondal , Soumyabrata Dey , Navin Kumar

Categories: Natural Language Processing | Machine Learning

2022-06-26

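The spread of information through social media platforms can create environments that may be hostile to vulnerable communities and can silence certain groups in society. To mitigate such situations, several models have been developed to detect hate and offensive speech. Because detecting hate and offensive speech on social media platforms may wrongly exclude individuals from those platforms, which can reduce trust, there is a need for explainable and interpretable models. We therefore build an explainable and interpretable high-performance model based on the XGBoost algorithm, trained on Twitter data. On unbalanced Twitter data, XGBoost outperformed the LSTM, AutoGluon, and ULMFiT models on hate speech detection, with an F1 score of 0.75 compared with 0.38, 0.37, and 0.38 respectively. When we down-sampled the data to three separate classes of approximately 5,000 tweets each, XGBoost performed better than LSTM, AutoGluon, and ULMFiT, with F1 scores for hate speech detection of 0.79 versus 0.69, 0.77, and 0.66 respectively. In the down-sampled version, XGBoost is also reported to perform better than LSTM, AutoGluon, and ULMFiT on offensive speech detection, with F1 scores of 0.83 versus 0.88, 0.82, and 0.79 respectively. We apply Shapley Additive Explanations (SHAP) to the outputs of our XGBoost models to make them explainable and interpretable, in contrast to LSTM, AutoGluon, and ULMFiT, which are black-box models.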
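Since the comparison above rests entirely on F1 scores, a short sketch of how a per-class F1 is computed may help when reading the numbers. This is a generic illustration, not the authors' evaluation code; the function name `f1_for_class` and the label strings in the usage example are hypothetical.

```python
def f1_for_class(y_true, y_pred, positive):
    """F1 score for one class, treating `positive` as the class of interest."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        return 0.0  # no true positives: precision and recall are both zero
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    # Harmonic mean of precision and recall.
    return 2 * precision * recall / (precision + recall)


# Hypothetical three-class labels, for illustration only.
y_true = ["hate", "hate", "offensive", "neither", "hate"]
y_pred = ["hate", "offensive", "hate", "neither", "hate"]
hate_f1 = f1_for_class(y_true, y_pred, positive="hate")
```

Because F1 ignores true negatives, it stays informative when one class is rare, which is why it is a common choice for unbalanced data such as the Twitter set described above.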

e-Inu: Simulating A Quadruped Robot With Emotional Sentience

Abhiruph Chakravarty , Jatin Karthik Tripathy , Sibi Chakkaravarthy S , Aswani Kumar Cherukuri , S. Anitha , Firuz Kamalov , Annapurna Jonnalagadda

Categories: Robotics | Machine Learning

2023-01-03

Quadruped robots are currently used in industrial robotics as mechanical aids to automate several routine tasks. However, the use of such a robot in a domestic setting is still very much a research topic. This paper discusses the understanding and virtual simulation of such a robot, capable of detecting and understanding human emotions, generating its gait, and responding via sounds and expressions on a screen. To this end, we use a combination of reinforcement learning and software engineering concepts to simulate a quadruped robot that can understand emotions, navigate through various terrains, detect sound sources, and respond to emotions using audio-visual feedback. This paper aims to establish the framework for simulating a quadruped robot that is emotionally intelligent and can primarily respond to audio-visual stimuli using motor or audio responses. Emotion detection from speech was not as performant as ERANNs or Zeta Policy learning, but still achieved an accuracy of 63.5%. The video emotion detection system produced results that are almost on par with the state of the art, with an accuracy of 99.66%. Due to its "on-policy" learning process, the PPO algorithm learned extremely rapidly, allowing the simulated dog to demonstrate a remarkably seamless gait across the different cadences and variations. This enabled the quadruped robot to respond to generated stimuli, allowing us to conclude that it functions as predicted and satisfies the aim of this work.

NaQ: Leveraging Narrations as Queries to Supervise Episodic Memory

Santhosh Kumar Ramakrishnan , Ziad Al-Halah , Kristen Grauman

Categories: Computer Vision

2023-01-02

Searching long egocentric videos with natural language queries (NLQ) has compelling applications in augmented reality and robotics, where a fluid index into everything that a person (agent) has seen before could augment human memory and surface relevant information on demand. However, the structured nature of the learning problem (free-form text query inputs, localized video temporal window outputs) and its needle-in-a-haystack nature make it both technically challenging and expensive to supervise. We introduce Narrations-as-Queries (NaQ), a data augmentation strategy that transforms standard video-text narrations into training data for a video query localization model. Validating our idea on the Ego4D benchmark, we find it has tremendous impact in practice. NaQ improves multiple top models by substantial margins (even doubling their accuracy), and yields the very best results to date on the Ego4D NLQ challenge, soundly outperforming all challenge winners in the CVPR and ECCV 2022 competitions and topping the current public leaderboard. Beyond achieving the state of the art for NLQ, we also demonstrate unique properties of our approach, such as gains on long-tail object queries and the ability to perform zero-shot and few-shot NLQ.

Statistical Machine Translation for Indic Languages

Sudhansu Bala Das , Divyajoti Panda , Tapas Kumar Mishra , Bidyut Kr. Patra

Categories: Natural Language Processing

2023-01-02

A Machine Translation (MT) system generally aims to render a source language automatically into a target language while retaining the original context, using various Natural Language Processing (NLP) techniques. Among the various NLP methods is Statistical Machine Translation (SMT), which uses probabilistic and statistical techniques to analyze information and perform the conversion. This paper discusses the development of bilingual SMT models for translating English to fifteen low-resource Indian Languages (ILs) and vice versa. At the outset, each of the 15 languages is briefly described as it relates to our experiments. We then analyze in detail the Samanantar and OPUS datasets used for model building, along with the standard benchmark dataset (Flores-200) used for fine-tuning and testing. Different preprocessing approaches are proposed in this paper to handle noise in the datasets. To create the system, the open-source MOSES SMT toolkit is explored. Distance reordering is utilized with the aim of understanding the rules of grammar and context-dependent adjustments through a phrase reordering categorization framework. In our experiment, translation quality is evaluated using standard metrics such as BLEU, METEOR, and RIBES.

Argoverse 2: Next Generation Datasets for Self-Driving Perception and Forecasting

Benjamin Wilson , William Qi , Tanmay Agarwal , John Lambert , Jagjeet Singh , Siddhesh Khandelwal , Bowen Pan , Ratnesh Kumar , Andrew Hartnett , Jhony Kaesemodel Pontes

Categories: Computer Vision | Artificial Intelligence | Machine Learning | Robotics

2023-01-02

We introduce Argoverse 2 (AV2) - a collection of three datasets for perception and forecasting research in the self-driving domain. The annotated Sensor Dataset contains 1,000 sequences of multimodal data, encompassing high-resolution imagery from seven ring cameras, and two stereo cameras in addition to lidar point clouds, and 6-DOF map-aligned pose. Sequences contain 3D cuboid annotations for 26 object categories, all of which are sufficiently-sampled to support training and evaluation of 3D perception models. The Lidar Dataset contains 20,000 sequences of unlabeled lidar point clouds and map-aligned pose. This dataset is the largest ever collection of lidar sensor data and supports self-supervised learning and the emerging task of point cloud forecasting. Finally, the Motion Forecasting Dataset contains 250,000 scenarios mined for interesting and challenging interactions between the autonomous vehicle and other actors in each local scene. Models are tasked with the prediction of future motion for "scored actors" in each scenario and are provided with track histories that capture object location, heading, velocity, and category. In all three datasets, each scenario contains its own HD Map with 3D lane and crosswalk geometry - sourced from data captured in six distinct cities. We believe these datasets will support new and existing machine learning research problems in ways that existing datasets do not. All datasets are released under the CC BY-NC-SA 4.0 license.

Mapping smallholder cashew plantations to inform sustainable tree crop expansion in Benin

Leikun Yin , Rahul Ghosh , Chenxi Lin , David Hale , Christoph Weigl , James Obarowski , Junxiong Zhou , Jessica Till , Xiaowei Jia , Troy Mao

Categories: Computer Vision | Machine Learning

2023-01-01

Cashews are grown by over 3 million smallholders in more than 40 countries worldwide as a principal source of income. As the third largest cashew producer in Africa, Benin has nearly 200,000 smallholder cashew growers contributing 15% of the country's national export earnings. However, a lack of information on where and how cashew trees grow across the country hinders decision-making that could support increased cashew production and poverty alleviation. By leveraging 2.4-m Planet Basemaps and 0.5-m aerial imagery, newly developed deep learning algorithms, and large-scale ground truth datasets, we successfully produced the first national map of cashew in Benin and characterized the expansion of cashew plantations between 2015 and 2021. In particular, we developed a SpatioTemporal Classification with Attention (STCA) model to map the distribution of cashew plantations, which can fully capture texture information from discriminative time steps during a growing season. We further developed a Clustering Augmented Self-supervised Temporal Classification (CASTC) model to distinguish high-density versus low-density cashew plantations by automatic feature extraction and optimized clustering. Results show that the STCA model has an overall accuracy of 80% and the CASTC model achieved an overall accuracy of 77.9%. We found that the cashew area in Benin has doubled from 2015 to 2021 with 60% of new plantation development coming from cropland or fallow land, while encroachment of cashew plantations into protected areas has increased by 70%. Only half of cashew plantations were high-density in 2021, suggesting high potential for intensification. Our study illustrates the power of combining high-resolution remote sensing imagery and state-of-the-art deep learning algorithms to better understand tree crops in the heterogeneous smallholder landscape.

Linear programming word problems formulation using EnsembleCRF NER labeler and T5 text generator with data augmentations

JiangLong He , Mamatha N , Shiv Vignesh , Deepak Kumar , Akshay Uppal

Categories: Natural Language Processing | Artificial Intelligence

2022-12-30

We propose an ensemble approach to predict the labels in linear programming word problems. The entity identification and the meaning representation are two types of tasks to be solved in the NL4Opt competition. We propose the ensembleCRF method to identify the named entities for the first task. We found that single models didn't improve for the given task in our analysis. A set of prediction models predict the entities. The generated results are combined to form a consensus result in the ensembleCRF method. We present an ensemble text generator to produce the representation sentences for the second task. We thought of dividing the problem into multiple small tasks due to the overflow in the output. A single model generates different representations based on the prompt. All the generated text is combined to form an ensemble and produce a mathematical meaning of a linear programming problem.
